Why averaging classifiers can protect against overfitting
Abstract
We study a simple learning algorithm for binary classification. Instead of predicting with the best hypothesis in the hypothesis class, this algorithm predicts with a weighted average of all hypotheses, weighted exponentially with respect to their training error. We show that the prediction of this algorithm is much more stable than the prediction of an algorithm that predicts with the best hypothesis. By allowing the algorithm to abstain from predicting on some examples, we show that the predictions it makes when it does not abstain are very reliable. Finally, we show that the probability that the algorithm abstains is comparable to the generalization error of the best hypothesis in the class.
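The averaging-with-abstention rule described in the abstract can be sketched as follows. This is a minimal illustration, assuming a small finite hypothesis class; the inverse-temperature parameter `eta` and the abstention `margin` are hypothetical knobs chosen for the sketch, not the thresholds derived in the paper's analysis:

```python
import numpy as np

def averaged_predict(hypotheses, X_train, y_train, x, eta=2.0, margin=0.1):
    """Predict with a weighted average of all hypotheses, weighted
    exponentially with respect to their training error.

    hypotheses: list of callables h(x) -> {-1, +1}
    eta: illustrative inverse-temperature parameter (assumption)
    margin: illustrative abstention threshold on the weighted vote
    Returns +1, -1, or None (abstain).
    """
    n = len(X_train)
    weights, votes = [], []
    for h in hypotheses:
        # training error of this hypothesis
        train_err = np.mean([h(xi) != yi for xi, yi in zip(X_train, y_train)])
        # exponential weight: low-error hypotheses dominate the average
        weights.append(np.exp(-eta * n * train_err))
        votes.append(h(x))
    weights = np.array(weights) / np.sum(weights)
    avg = float(np.dot(weights, votes))  # weighted vote in [-1, +1]
    if abs(avg) < margin:
        return None  # abstain: the weighted hypotheses disagree too much
    return 1 if avg > 0 else -1
```

When the weighted vote is close to zero, the hypotheses with significant weight disagree, so the sketch abstains; when it does predict, the prediction is backed by most of the exponential weight, which is the intuition behind the reliability claim.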
Similar resources
Bayesian Conditional Gaussian Network Classifiers with Applications to Mass Spectra Classification
Classifiers based on probabilistic graphical models are very effective. In continuous domains, maximum likelihood is usually used to assess the predictions of those classifiers. When data is scarce, this can easily lead to overfitting. In any probabilistic setting, Bayesian averaging (BA) provides theoretically optimal predictions and is known to be robust to overfitting. In this work we introd...
Bayesian Model Averaging with Cross-Validated Models
Several variants of Bayesian Model Averaging (BMA) are described and evaluated on a model library of heterogeneous classifiers, and compared to other classifier combination methods. In particular, embedded cross-validation is investigated as a technique for reducing overfitting in BMA.
Bayesian Averaging of Classifiers and the Overfitting Problem
Although Bayesian model averaging is theoretically the optimal method for combining learned models, it has seen very little use in machine learning. In this paper we study its application to combining rule sets, and compare it with bagging and partitioning, two popular but more ad hoc alternatives. Our experiments show that, surprisingly, Bayesian model averaging’s error rates are consistently ...
Selecting One Dependency Estimators in Bayesian Network Using Different MDL Scores and Overfitting Criterion
The Averaged One Dependency Estimator (AODE) integrates all possible Super-Parent-One-Dependency Estimators (SPODEs) and estimates class conditional probabilities by averaging them. In an AODE network, redundant SPODEs may bias the classifier and, as a consequence, substantially reduce classification accuracy. In this paper, a kind of MDL metric is used to sele...
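The averaging step AODE performs can be sketched as below. This is a simplified illustration with Laplace-style smoothing that averages over all SPODEs; it omits the MDL-based SPODE selection this paper proposes, and the function name and smoothing constant `m` are illustrative:

```python
import numpy as np

def aode_predict(X, y, x, m=1.0):
    """Minimal AODE sketch: average the class scores of all SPODEs.

    X: 2D array of categorical attribute values; y: class labels.
    Each attribute i in turn acts as super-parent; that SPODE scores
    class c as P(c, x_i) * prod_j P(x_j | c, x_i), estimated from
    counts with Laplace-style smoothing (constant m, an assumption).
    """
    X = np.asarray(X)
    y = np.asarray(y)
    n, d = X.shape
    classes = sorted(set(y))
    scores = {}
    for c in classes:
        total = 0.0
        for i in range(d):  # attribute i as super-parent
            mask = (y == c) & (X[:, i] == x[i])
            # smoothed estimate of the joint P(c, x_i)
            joint = (mask.sum() + m) / (n + m * len(classes))
            s = joint
            for j in range(d):
                if j == i:
                    continue
                # smoothed estimate of P(x_j | c, x_i)
                num = (X[mask, j] == x[j]).sum() + m
                s *= num / (mask.sum() + m * len(set(X[:, j])))
            total += s
        scores[c] = total / d  # average over the d SPODEs
    return max(scores, key=scores.get)
```

Averaging over every super-parent avoids committing to a single attribute-dependence structure; the selection criterion studied in the paper then prunes SPODEs whose contribution only adds bias.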
Compression-Based Averaging of Selective Naive Bayes Classifiers
The naive Bayes classifier has proved to be very effective on many real data applications. Its performance usually benefits from an accurate estimation of univariate conditional probabilities and from variable selection. However, although variable selection is a desirable feature, it is prone to overfitting. In this paper, we introduce a Bayesian regularization technique to select the most prob...